Abstract

Spectral-spatial classification of hyperspectral images
suffers from two problems: the large number of available
feature extraction methods, which complicates the choice of
which ones to apply, and the limited number of labeled
training samples. To overcome these difficulties, this paper
presents a new spectral-spatial classification approach for
remotely sensed hyperspectral images that integrates
different spectral and spatial features via multi-feature
kernels and remains accurate with a limited number of
training samples. The proposed method uses several
techniques to extract the spectral and spatial features and
exploits oversampling based on interpolation techniques to
generate new labeled samples. First, each pixel is
characterized by two spectral vectors, computed by principal
component analysis and independent component analysis, and
by three spatial features: the average of the neighbourhood
pixels, the textural features and the extended
multi-attribute profiles. Then, an oversampling step creates
new labeled samples that are used to train the classifier.
Finally, a support vector machine (SVM) with a multi-feature
kernel is trained to generate the classification map. The
proposed approach is evaluated experimentally on the AVIRIS
Indian Pines data set, where it achieves higher accuracy
than multi-feature classification without oversampling.

Keywords: Hyperspectral images, SVM, multi-feature kernels,
interpolation techniques.

Nomenclature: 

AA Average Accuracy 

AP Attribute Profile 

ASM Angular Second Moment 

EAP Extended Attribute Profile 


ENT Entropy 

EMAP Extended Multi-Attribute Profile 

GLCM Gray Level Co-Occurrence Matrix 

ICA Independent Component Analysis

k Kappa coefficient 

LM Local Mean 

MLR Multinomial Logistic Regression 

MP Morphological Profiles 

OA Overall Accuracy 

PCA Principal Component Analysis

SVM Support Vector Machine 

Var Variance 


1. Introduction 

Recent advances in remote sensing provide images with high
spectral and spatial resolution. A hyperspectral image
represents the captured scene in hundreds of narrow
contiguous bands spanning the visible-to-infrared spectrum.
For this reason, hyperspectral data are used in diverse
applications such as agriculture [1], astronomy [2],
surveillance [3] and environmental sciences [4]. These
applications rely on the classification of each pixel in the
hyperspectral image. The objective of classification is to
assign each pixel to one of the classes, based on its
spectral and spatial characteristics. Exploiting the highly
informative spectral and spatial content of hyperspectral
pixels therefore improves classification accuracy.
Nevertheless, the complexity and high dimensionality of
hyperspectral data make classification a challenging task.

In the last decades, many discriminative classification 
approaches have been developed. Among these, the 
SVM [5] and MLR [6]— [4] have demonstrated to be very 
powerful. In particular, SVM has shown good performance when
classifying high-dimensional data [7]. For this reason,
various spectral-spatial classification methods based on SVM
have been presented in the literature. Composite kernels
[8], which combine spectral and spatial kernels, have been
used to obtain an accurate classification. The use of
stacked vectors that concatenate spectral and contextual
information extracted by MP has shown significant
improvements [9]. Edge-preserving filters have been
exploited to develop a spectral-spatial classification that
outperforms classification without filtering [10]. A
multi-feature model that constructs a set of SVMs combining
multiple spectral and spatial features [11] has achieved an
accurate classification. Generalized composite kernels have
been introduced to improve classification performance [12].
Segmentation techniques have been investigated in [13].
Multi-feature kernels [14], which combine one kernel per
feature, have also produced accurate classifications.

The literature review shows that all these studies emphasize
the importance of spatial information in hyperspectral image
classification. However, the availability of many spectral
and spatial characterization methods (e.g. PCA [15], ICA
[16], morphological features [17], wavelet-based texture
[18], GLCM [19]) complicates the selection of the methods to
use.

Another difficulty discussed in the literature is the Hughes
phenomenon, which arises from the high dimensionality of
hyperspectral data combined with the limited number of
available training samples. Different solutions have been
proposed to address this problem. Among these are feature
selection [20] and feature extraction methods, which reduce
the dimensionality of the data, and semi-supervised learning
techniques, which augment the training set with samples
drawn from the unlabeled pixels. Synthetic data have also
been investigated in [21] to enlarge the set of labeled
samples by oversampling, which generates new samples by
means of interpolation techniques.

In this context, we propose a multi-feature spectral-spatial
classification approach based on oversampling that aims to
solve the problem of the limited number of training samples
and to overcome the difficulty of choosing the
characterization methods. The proposed approach consists of
three main steps: 1) a spectral and spatial characterization
step that applies different methods to extract the spectral
and spatial features, 2) an oversampling step that exploits
interpolation techniques to create new labeled examples and
3) a classification step based on an SVM with multi-feature
kernels that combine one kernel per feature.

The remainder of this paper is organized as follows. 
Section 2 describes the proposed approach. Section 3 
reports classification results based on real hyperspectral 
data sets. Finally, Section 4 concludes with some remarks. 

2. Proposed approach 

The goal of the proposed approach is to obtain an accurate
SVM classification while dealing with two problems:

- the existence of many feature extraction methods and the
difficulty of choosing a suitable one,

- the limited number of training samples.


To this end, we apply the oversampling step presented in
[21] to increase the number of labeled pixels, and we use
multi-feature kernels to combine the attributes produced by
various spectral and spatial feature extraction techniques.

The proposed approach consists of three main steps: 1) a
spectral and spatial characterization step that applies
different methods to extract the spectral and spatial
features, 2) an oversampling step that exploits
interpolation techniques to create new labeled examples and
3) a classification step based on an SVM with multi-feature
kernels.

2.1. Spectral and spatial characterization 
The rich spectral and spatial information available in
hyperspectral images makes it possible to distinguish
between spectrally similar materials. Various methods have
been widely used in the literature for the spectral and
spatial characterization of hyperspectral pixels. For the
spectral characterization, authors usually use either all
the spectral information or dimensionality reduction
techniques such as PCA and ICA to extract the most
informative data. For spatial feature extraction, different
approaches have been adopted, such as features derived from
the neighborhood of the pixel, attribute filters and
textural features.

In this paper, we focus on PCA and ICA for the spectral
characterization and on the mean of the neighborhood pixels,
EMAP based on attribute filters, and textural features for
the spatial characterization.

• PCA: a statistical procedure that uses an orthogonal
transformation to convert a set of observations of
possibly correlated variables into a set of values of
linearly uncorrelated variables called principal
components. PCA thus removes the correlation among the
bands. In the process, the optimum linear combination of
the original bands accounting for the variation of pixel
values in an image is identified.

• ICA: a popular approach to blind source separation. It
has been investigated in the analysis of hyperspectral
images to remove the dependence between bands.

• Average of neighbourhood pixels: this spatial
characterization technique describes each pixel $p$ in
terms of its neighbors $p_k$ in a window of size
$i \times j$ by averaging their spectral vectors $X(p_k)$.
It returns $X_{avg}$ (Equation (1)).

$$X_{avg}(p) = \frac{1}{i \times j} \sum_{k=1}^{i \times j} X(p_k) \qquad (1)$$

• Textural features [22]: emphasize the texture structure
of the gray-level image. They are local indexes computed by
means of sliding windows of size P x Q. For a hyperspectral
image, these metrics can be computed on the panchromatic
band, the first principal component or a discriminative
band. Among these features we note:

- Local mean (LM), also called focal mean: computed on the
gray-level values contained in the sliding window centered
on the pixel $x_{ij}$. It returns a local texture value
$X_{ij}^{LM}$ (Equation (2)).

$$X_{ij}^{LM} = \frac{1}{PQ} \sum_{(p,q) \in w} x_{pq} \qquad (2)$$

where $w$ denotes the pixels contained in the window
centered on $x_{ij}$.

- Variance (Var): it returns a local texture value
$X_{ij}^{Var}$ (Equation (3)).

$$X_{ij}^{Var} = \frac{1}{PQ} \sum_{(p,q) \in w} \left( x_{pq} - X_{ij}^{LM} \right)^2 \qquad (3)$$

where $w$ denotes the pixels contained in the window
centered on $x_{ij}$ and $X_{ij}^{LM}$ is the local mean of
the considered pixel.

- Entropy (ENT): this measure computes the intensity of the
texture in the considered image. It is based on the GLCM,
which represents the relative occurrence frequency
$p(n_1, n_2)$ of two gray-level intensities $n_1$ and $n_2$
in the P x Q window at a given angular neighbourhood
(Equation (4)).

$$X_{ij}^{ENT} = -\sum_{n_1} \sum_{n_2} p(n_1, n_2) \log p(n_1, n_2) \qquad (4)$$


- Angular second moment (ASM): it indicates the local
contrast, providing an estimate of the degree of uniformity
of the values of the GLCM (Equation (5)).

$$X_{ij}^{ASM} = \sum_{n_1} \sum_{n_2} p(n_1, n_2)^2 \qquad (5)$$


• EMAP [23]: a profile that stacks the EAPs obtained using
different types of attributes. An EAP is obtained by
generating an AP (computed by applying a sequence of
attribute filters with various thresholds) on each of the
first p principal components.
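The textural features of Equations (2) to (5) can be illustrated on a single window with the following minimal sketch. It is not the authors' implementation: the function names, the single pixel offset used to build the GLCM, the natural logarithm and the assumption that the window is already quantized to integer gray levels are all choices made here for illustration.

```python
import numpy as np

def local_mean_and_variance(window):
    """Equations (2) and (3): mean and variance of the P x Q window values."""
    w = np.asarray(window, dtype=float)
    lm = w.mean()
    var = ((w - lm) ** 2).mean()
    return lm, var

def glcm(window, levels, offset=(0, 1)):
    """Normalized co-occurrence frequencies p(n1, n2) for one pixel offset.

    Assumption: `window` holds integer gray levels in [0, levels).
    """
    w = np.asarray(window, dtype=int)
    d_row, d_col = offset
    rows, cols = w.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[w[r, c], w[r2, c2]] += 1.0
    total = counts.sum()
    return counts / total if total > 0 else counts

def entropy_and_asm(p):
    """Equations (4) and (5), skipping zero entries in the logarithm."""
    nonzero = p[p > 0]
    ent = -np.sum(nonzero * np.log(nonzero))
    asm = np.sum(p ** 2)
    return ent, asm

window = [[0, 0], [1, 1]]
lm, var = local_mean_and_variance(window)
ent, asm = entropy_and_asm(glcm(window, levels=2))
```

In a full pipeline these functions would be evaluated on a sliding window around every pixel, and the GLCM would typically be accumulated over several directions.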


2.2. Oversampling 

To increase the number of training samples, we implemented
the oversampling algorithm (Algorithm 1) of [21]. The goal
of this algorithm is to generate, from the $t$ feature
vectors $y_i$ of dimension $dim$ that form the set of
training samples ($Y^{train} = (y_1, \ldots, y_t)$), $g$ new
feature vectors that form the set of new training samples
($Y^{new} = (y^{new}_1, \ldots, y^{new}_g)$) by means of
interpolation techniques. Three interpolation methods are
used: linear interpolation, cubic spline interpolation and
Lagrange interpolation.


Algorithm 1: Oversampling

READ Train
FOR each row i of the matrix Train (i = 1 to dim)
    Represent each value of row i by a point.
    Compute the interpolation function f.
    Generate the abscissas of the new samples.
    Compute the new training samples by evaluating the
    interpolation function f at the new abscissas.
    Save the new values in Y_new.
ENDFOR
PRINT Y_new
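The row-wise interpolation of Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of [21]: the sample index (1 to t) is taken as the abscissa of each point, the new abscissas are placed evenly inside that range, and only the linear and Lagrange variants are shown (a cubic spline variant would need an additional library such as SciPy). The names `oversample` and `lagrange_eval` are hypothetical.

```python
import numpy as np

def lagrange_eval(x_nodes, y_nodes, x_new):
    """Evaluate the Lagrange interpolating polynomial through the nodes."""
    x_nodes = np.asarray(x_nodes, dtype=float)
    y_nodes = np.asarray(y_nodes, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    result = np.zeros_like(x_new)
    for k in range(len(x_nodes)):
        basis = np.ones_like(x_new)
        for m in range(len(x_nodes)):
            if m != k:
                basis *= (x_new - x_nodes[m]) / (x_nodes[k] - x_nodes[m])
        result += y_nodes[k] * basis
    return result

def oversample(train, g, method="linear"):
    """Generate g new samples from a (dim, t) training matrix.

    Each column of `train` is one labeled sample; each row is one feature.
    """
    train = np.asarray(train, dtype=float)
    dim, t = train.shape
    x = np.arange(1, t + 1, dtype=float)          # one point per sample
    # Assumption: new abscissas are evenly spaced strictly inside [1, t].
    x_new = np.linspace(1.0, float(t), g + 2)[1:-1]
    y_new = np.empty((dim, g))
    for i in range(dim):                          # FOR each row i
        if method == "linear":
            y_new[i] = np.interp(x_new, x, train[i])
        elif method == "lagrange":
            y_new[i] = lagrange_eval(x, train[i], x_new)
        else:
            raise ValueError("unknown interpolation method: " + method)
    return y_new

train = np.array([[0.0, 2.0, 4.0],
                  [1.0, 4.0, 9.0]])
new_linear = oversample(train, g=2, method="linear")
new_lagrange = oversample(train, g=2, method="lagrange")
```

Because a Lagrange polynomial through t points has degree t - 1, it can oscillate strongly when t is large, so this sketch is only meaningful for the small per-class training sets considered here.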



Figure 1 shows the flowchart of the oversampling step.

Figure 1: Flowchart of oversampling

2.3. Classification via SVM with multi-feature kernels
SVMs have been widely adopted in the classification of
hyperspectral images because of their high performance on
high-dimensional data. The SVM [24] is a kernel-based
classifier that projects the data into a higher-dimensional
space by means of a non-linear mapping function $\Phi$ and
seeks the optimal separating hyperplane by margin
maximization. The SVM was first proposed for binary
classification and was later extended to multi-class
classification.

In order to improve the classification performance achieved
with the spectral information alone, various
spectral-spatial classification approaches that incorporate
spatial information in addition to spectral information
have been proposed in the literature. In particular, SVMs
with kernels that combine different types of kernels, such
as composite kernels [8] and multi-feature kernels [14],
have shown high performance in terms of accuracy.

In this paper, each pixel is characterized by two spectral
vectors, $x^{PCA}$ and $x^{ICA}$, obtained by applying PCA
and ICA respectively, and by three spatial vectors,
$x^{Neigh}$, $x^{Text}$ and $x^{EMAP}$, computed
respectively from the mean of the neighborhood pixels, the
textural features and the EMAP. We therefore implemented
three different multi-feature kernels that combine these
spectral and spatial attributes:


• Kernel 1 (Equation (6)):

$$K^{PCA+ICA+Neigh+Text+EMAP}(x_i, x_j) = K_{PCA}(x_i^{PCA}, x_j^{PCA}) + K_{ICA}(x_i^{ICA}, x_j^{ICA}) + K_{Neigh}(x_i^{Neigh}, x_j^{Neigh}) + K_{Text}(x_i^{Text}, x_j^{Text}) + K_{EMAP}(x_i^{EMAP}, x_j^{EMAP}) \qquad (6)$$
• Kernel 2 (Equation (7)):

$${}^{\mu}K^{PCA+ICA+Neigh+Text+EMAP}(x_i, x_j) = \mu \times \left[ K_{PCA}(x_i^{PCA}, x_j^{PCA}) + K_{ICA}(x_i^{ICA}, x_j^{ICA}) \right] + (1 - \mu) \times \left[ K_{Neigh}(x_i^{Neigh}, x_j^{Neigh}) + K_{Text}(x_i^{Text}, x_j^{Text}) + K_{EMAP}(x_i^{EMAP}, x_j^{EMAP}) \right] \qquad (7)$$

with $0 < \mu < 1$.

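• Kernel 3 (Equation (8)):

$${}^{\alpha}K^{PCA+ICA+Neigh+Text+EMAP}(x_i, x_j) = \alpha_1 \times K_{PCA}(x_i^{PCA}, x_j^{PCA}) + \alpha_2 \times K_{ICA}(x_i^{ICA}, x_j^{ICA}) + \alpha_3 \times K_{Neigh}(x_i^{Neigh}, x_j^{Neigh}) + \alpha_4 \times K_{Text}(x_i^{Text}, x_j^{Text}) + \alpha_5 \times K_{EMAP}(x_i^{EMAP}, x_j^{EMAP}) \qquad (8)$$

where $\sum_{i=1}^{5} \alpha_i = 1$.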
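The three combinations can be sketched on precomputed Gram matrices as below. This is an illustrative sketch rather than the authors' code: the function names and the dictionary layout (one Gram matrix per feature, keyed by feature name) are assumptions made here. Since each term is a valid kernel and all weights are non-negative, each combination is again a valid kernel.

```python
import numpy as np

FEATURES = ("PCA", "ICA", "Neigh", "Text", "EMAP")

def direct_sum_kernel(grams):
    """Direct summation: unweighted sum of the per-feature Gram matrices."""
    return sum(grams[name] for name in FEATURES)

def mu_kernel(grams, mu):
    """Trade-off mu between the spectral and the spatial kernels."""
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    spectral = grams["PCA"] + grams["ICA"]
    spatial = grams["Neigh"] + grams["Text"] + grams["EMAP"]
    return mu * spectral + (1.0 - mu) * spatial

def alpha_kernel(grams, alphas):
    """One weight per feature; the weights must sum to 1."""
    if not np.isclose(sum(alphas[name] for name in FEATURES), 1.0):
        raise ValueError("the alpha weights must sum to 1")
    return sum(alphas[name] * grams[name] for name in FEATURES)

# Toy Gram matrices: k * identity for the k-th feature (illustration only).
grams = {name: (k + 1) * np.eye(2) for k, name in enumerate(FEATURES)}
alphas = dict(zip(FEATURES, (0.1, 0.1, 0.2, 0.2, 0.4)))
k_sum = direct_sum_kernel(grams)
k_mu = mu_kernel(grams, mu=0.25)
k_alpha = alpha_kernel(grams, alphas)
```

A combined matrix of this kind can be passed to any SVM implementation that accepts a precomputed kernel.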

To summarize, Algorithm 2 provides pseudocode for the
proposed spectral-spatial classification algorithm based on
an SVM classifier with multi-feature kernels and
oversampling.

Algorithm 2: SVMoversampling

READ Y = (y_1, ..., y_n)  // Pixels of the hyperspectral image.
READ T  // Set of training samples.

// Spectral characterization
Y^PCA = PCA(Y)  // Compute the PCA of each pixel.
T^PCA = [T_1^PCA, ..., T_c^PCA]  // PCA of the labeled pixels.
Y^ICA = ICA(Y)  // Compute the ICA of each pixel.
T^ICA = [T_1^ICA, ..., T_c^ICA]  // ICA of the training pixels.

// Spatial characterization
Y^Neigh = Neigh(Y)  // Average of the neighborhood pixels of each pixel.
T^Neigh = [T_1^Neigh, ..., T_c^Neigh]  // Average of the neighborhood pixels of the training samples.
Y^EMAP = EMAP(Y)  // Compute the EMAP of each pixel.
T^EMAP = [T_1^EMAP, ..., T_c^EMAP]
Y^Text = Text(Y)  // Compute the textural features of each pixel.
T^Text = [T_1^Text, ..., T_c^Text]

FOR each class i
    Y_i,new^PCA = oversampling(T_i^PCA)
    Y_i,new^ICA = oversampling(T_i^ICA)
    Y_i,new^Neigh = oversampling(T_i^Neigh)
    Y_i,new^EMAP = oversampling(T_i^EMAP)
    Y_i,new^Text = oversampling(T_i^Text)

    // Spectral and spatial features of training samples after oversampling.
    T_i^PCA = [T_i^PCA, Y_i,new^PCA]
    T_i^ICA = [T_i^ICA, Y_i,new^ICA]
    T_i^Neigh = [T_i^Neigh, Y_i,new^Neigh]
    T_i^EMAP = [T_i^EMAP, Y_i,new^EMAP]
    T_i^Text = [T_i^Text, Y_i,new^Text]

    // Features of training samples in all the classes.
    T_new^PCA = [T_new^PCA, T_i^PCA]
    T_new^ICA = [T_new^ICA, T_i^ICA]
    T_new^Neigh = [T_new^Neigh, T_i^Neigh]
    T_new^EMAP = [T_new^EMAP, T_i^EMAP]
    T_new^Text = [T_new^Text, T_i^Text]
ENDFOR

L = SVMClassification(Y^PCA, Y^ICA, Y^Neigh, Y^EMAP, Y^Text,
        T_new^PCA, T_new^ICA, T_new^Neigh, T_new^EMAP, T_new^Text)
        // SVM classification with multi-feature kernels.
PRINT L  // Labels of each pixel.


Figure 2 illustrates the flowchart of Algorithm 2.

The architecture of the proposed approach is illustrated in
Figure 3.

Figure 2: The flowchart of Algorithm 2.

Figure 3: Architecture of the proposed approach.


3. Experimental results 

To evaluate the performance of the proposed approach, we
classified the widely used hyperspectral image "Indian
Pines". It contains 145 x 145 pixels and 200 spectral bands.
The ground-truth data contain 16 classes and a total of
10366 labeled pixels. Classifying this image is a
challenging task because of the significant presence of
classes with similar spectral signatures and because of the
unbalanced number of available labeled pixels per class.

In all experiments, we use the overall accuracy (OA) and
the kappa coefficient as references to evaluate the
performance of the proposed classification approach.
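For reference, OA, AA and the kappa coefficient can all be derived from a confusion matrix, as in the sketch below. It uses the standard definitions of these measures; the function name and the toy confusion matrix are illustrative assumptions, not data from the experiments.

```python
def accuracy_metrics(confusion):
    """Return (OA, AA, kappa) from a square confusion matrix.

    confusion[i][j] is the number of pixels of true class i that were
    assigned to class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    oa = correct / total

    # AA: mean of the per-class accuracies (classes without samples skipped).
    per_class = [confusion[i][i] / sum(confusion[i])
                 for i in range(n_classes) if sum(confusion[i]) > 0]
    aa = sum(per_class) / len(per_class)

    # Kappa: agreement corrected for the agreement expected by chance.
    # Undefined when the chance agreement equals 1.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total ** 2)
    kappa = (oa - expected) / (1.0 - expected)
    return oa, aa, kappa

oa, aa, kappa = accuracy_metrics([[40, 10],
                                  [5, 45]])
```

On unbalanced data sets such as Indian Pines, AA and kappa complement OA because they are less dominated by the largest classes.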

For SVM classification, we implemented the most widely used
multi-class strategy, "one against all", and we used RBF and
polynomial kernels for the spectral and spatial features,
respectively, to construct the multi-feature kernels. The
training sets are randomly selected from the available
labeled samples, and the remaining samples are used for
validation. We optimized the SVM parameters using tenfold
cross-validation.
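The two base kernels can be written as follows. This is a sketch under assumptions: the parameter names `gamma`, `degree` and `coef0` and their example values are illustrative, since the values actually used were selected by cross-validation and are not reported in the text.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """RBF Gram matrix: exp(-gamma * ||a_i - b_j||^2)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    sq_dist = ((a ** 2).sum(axis=1)[:, None]
               + (b ** 2).sum(axis=1)[None, :]
               - 2.0 * a @ b.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def polynomial_kernel(a, b, degree=2, coef0=1.0):
    """Polynomial Gram matrix: (a_i . b_j + coef0) ** degree."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (a @ b.T + coef0) ** degree

samples = np.array([[0.0, 0.0],
                    [1.0, 0.0]])
k_rbf = rbf_kernel(samples, samples, gamma=1.0)       # spectral features
k_poly = polynomial_kernel(samples, samples)          # spatial features
```

In the setting described above, one such matrix would be computed per feature (RBF for the PCA and ICA vectors, polynomial for the three spatial vectors) before combining them into a multi-feature kernel.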

After the spectral and spatial characterization, each pixel
is represented by five vectors:

- $x^{PCA}$ is a spectral vector that contains the first
five principal components.

- $x^{ICA}$ is a spectral vector that contains six
independent components.

- $x^{Neigh}$ is a spatial vector that contains the average
of the neighborhood pixels in a window of size 3x3.

- $x^{EMAP}$ is a spatial vector that contains the EMAP of
each pixel. The EMAPs were built with the attributes and
thresholds presented in [25]: threshold values in the range
2.5% to 10% with a step of 2.5% for the standard deviation
attribute, and thresholds of 200, 500 and 1000 for the area
attribute.

- $x^{Text}$ is a spatial vector that contains the textural
features computed from the first three principal
components. Note that we applied three sliding windows
(3x3, 9x9 and 15x15) and used four directions to calculate
these features (LM, Var, ASM and ENT).

Note that the number of principal components used in
$x^{PCA}$ and the number of independent components adopted
for $x^{ICA}$ were fixed experimentally, according to the
spectral classification of the image when using 10 labeled
samples in each class (Figure 4).
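Keeping only the first k principal components, as done for the spectral vector above, can be sketched with a singular value decomposition. This is a generic PCA sketch, not the authors' code; the function name `pca_reduce` and the toy data are assumptions made for illustration.

```python
import numpy as np

def pca_reduce(pixels, k):
    """Project (n_pixels, n_bands) data onto its first k principal components."""
    x = np.asarray(pixels, dtype=float)
    centered = x - x.mean(axis=0)            # remove the mean of each band
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

# Toy data: 4 pixels, 2 bands, most of the variance along the first band.
pixels = np.array([[2.0, 0.0],
                   [0.0, 1.0],
                   [-2.0, 0.0],
                   [0.0, -1.0]])
scores = pca_reduce(pixels, k=2)
```

For an image cube of shape (rows, columns, bands), the pixels would first be reshaped to (rows * columns, bands); repeating the classification for several values of k is one way to choose the number of components experimentally.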

Figure 4. Variation of OA with the number of principal
components used (a) and the number of independent
components used (b).
In order to analyze the impact of the multi-feature kernel
and to select the most suitable one for the classification
of the "Indian Pines" data set, we analyze the performance
of the proposed method for the three kernels
$K^{PCA+ICA+Neigh+Text+EMAP}$,
${}^{\mu}K^{PCA+ICA+Neigh+Text+EMAP}$ and
${}^{\alpha}K^{PCA+ICA+Neigh+Text+EMAP}$ in a classification
that generates 10 samples per class from 10 available
training samples. Figure 5 shows the OAs and kappa
coefficients obtained by the proposed classification
algorithm for each kernel. Note that we used $\mu = 0.25$
for ${}^{\mu}K^{PCA+ICA+Neigh+Text+EMAP}$ and
$(\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5) =
(0.1, 0.1, 0.2, 0.2, 0.4)$ for
${}^{\alpha}K^{PCA+ICA+Neigh+Text+EMAP}$. This choice was
fixed experimentally (Figure 6). As shown in Figure 5, the
performance of the proposed classification algorithm
increases when using
${}^{\alpha}K^{PCA+ICA+Neigh+Text+EMAP}$, which emphasizes
the spatial features extracted by EMAP ($\alpha_5 = 0.4 >
\alpha_3 = \alpha_4 = 0.2 > \alpha_1 = \alpha_2 = 0.1$); it
yielded much better OA and kappa results (OA = 82% and
kappa = 81.73%). Furthermore, the weighted summation kernel
that introduces a trade-off $\mu$ between spectral and
spatial kernels, with $\mu = 0.25$, performs more accurately
than the direct summation kernel.



Figure 5: OA and kappa coefficient obtained with each
multi-feature kernel.

Figure 6. Variation of OA with $\mu$ for the kernel
${}^{\mu}K^{PCA+ICA+Neigh+Text+EMAP}$ (a) and with
$(\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5)$ for the
kernel ${}^{\alpha}K^{PCA+ICA+Neigh+Text+EMAP}$ (b).


To show the advantage of oversampling, Table 1 reports the
classification results obtained for different numbers of
training samples with and without oversampling. In this
experiment, we used cubic spline interpolation for the
spectral and the spatial features to create new labeled
samples and we adopted
${}^{\mu}K^{PCA+ICA+Neigh+EMAP+Text}$ as a multi-feature kernel.

Table 1 reports the average OA followed by the standard
deviation (±) and the kappa coefficient obtained after ten
Monte Carlo runs. With oversampling, the proposed method
significantly improved the classification results obtained
without oversampling for all the training set sizes
considered (10, 20 and 30). For instance, generating 40
samples from 10 labeled examples yielded an OA of 85.09%,
6.09 percentage points higher than that obtained by the SVM
without oversampling. The samples obtained by oversampling
thus improve the accuracy of the supervised classifier (SVM
with multi-feature kernel). Notice also that increasing the
number of generated samples significantly improves the
classification performance (Figure 7), which indicates the
benefit of these samples: they increase the ability of the
SVM to find the optimal separating hyperplane.

Table 1: OA (%) ± standard deviation and kappa coefficient (in parentheses) obtained for the Indian Pines data set

Labeled   Number of generated samples
samples   0                     10                     20                     30                     40
10        79 ± 1.3 (0.789)      81.32 ± 1.25 (0.81)    82.5 ± 1.4 (0.821)     84.04 ± 0.54 (0.83)    85.09 ± 0.64 (0.849)
20        85.56 ± 1.2 (0.86)    87.86 ± 1.11 (0.878)   88.39 ± 0.66 (0.89)    89.54 ± 0.67 (0.9)     90.34 ± 1.1 (0.91)
30        87.68 ± 0.92 (0.85)   89.34 ± 1.07 (0.896)   90.19 ± 1.15 (0.907)   91.3 ± 1.1 (0.92)      92.11 ± 0.56 (0.93)



Figure 7: Variation of OA (a) and kappa coefficient (b) with
the generated set size, for 10, 20 and 30 labeled samples.
Focusing on the oversampling step, recall that we
implemented three interpolation techniques to generate new
training data: linear interpolation, cubic spline
interpolation and Lagrange interpolation. To analyze the
impact of these techniques and to select the most adequate
one for each feature, Table 2 reports the spectral, spatial
and spectral-spatial classification results obtained after
generating 20 samples from 10 labeled samples with the
different interpolation methods. For each interpolation
method, Table 2 gives the OA, AA, kappa coefficient (k) and
individual class accuracies (in percent) achieved by the
spectral classification (two spectral classifications,
using the features extracted by PCA and by ICA), the spatial
classification (three spatial classifications, using the
average of neighborhood pixels, the textural features and
the features extracted by EMAP) and the spectral-spatial
classification obtained by the proposed method. Using
Lagrange interpolation to create new spectral features
(PCA, ICA) and new spatial features extracted by EMAP and
by the average of neighborhood pixels leads to more
accurate spectral and spatial classifications than the
other methods (cubic spline and linear interpolation). For
the textural features, linear interpolation provided the
highest accuracy. Therefore, for the spectral-spatial
classification we applied Lagrange interpolation to the
features extracted by PCA, ICA, Neigh and EMAP and linear
interpolation to the textural features. This combination
performed well: it obtained an OA of 83.87% and a kappa
coefficient of 0.8385, which are 1.37 percentage points and
0.017 higher than the values obtained with cubic spline
interpolation for all features (reported in Table 1). This
indicates that the samples generated by this combination
are properly created, given their similarity to the
oversampled data in each class.

Figure 8 shows the ground truth and the classification maps
obtained without oversampling and with the proposed method
for the AVIRIS Indian Pines scene. The advantage of the
proposed classification approach is clearly visible in this
figure.


Table 2: Individual class accuracies (%), OA, AA and kappa coefficient obtained for the AVIRIS Indian Pines data set (Lin: linear, Spl: cubic spline, Lag: Lagrange interpolation)

Classification with oversampling
                                  Spectral classification                          Spatial classification
                                  PCA                     ICA                      EMAP
Class                    Test     Lin     Spl     Lag     Lin     Spl     Lag     Lin     Spl     Lag
                         samples
Alfalfa                  44       77.27   79.55   79.55   86.36   86.36   79.55   86.36   86.36   86.36
Corn-notill              1424     22.33   8.64    15.45   27.46   21.35   23.88   30.34   30.34   30.68
Corn-mintill             824      22.21   21      38.47   34.10   28.40   43.45   34.10   27.55   27.55
Corn                     224      72.32   70.54   82.59   68.30   65.18   73.21   41.07   49.55   49.55
Grass/pasture            487      68.38   65.50   58.11   84.80   82.75   75.15   82.75   82.75   82.75
Grass/trees              737      81.68   84.12   86.43   83.18   88.60   87.38   87.89   87.89   85.89
Grass/pasture-mowed      16       81.25   87.50   87.50   68.75   87.50   87.50   93.75   93.75   93.75
Hay-windrowed            479      88.10   89.14   81      66.39   68.89   89.77   91.65   91.65   91.65
Oats                     10       90      90      100     80      90      100     100     90      100
Soybean-no till          958      43.63   46.56   34.76   15.76   48.33   13.47   48.15   48.15   55.5
Soybean-min till         2458     51.51   31.08   44.26   48.41   8.01    57.28   48.70   48.70   48.70
Soybean-clean till       604      43.21   36.26   44.87   45.20   26.16   64.57   66.23   66.23   66.23
Wheat                    202      63.37   74.75   79.70   52.97   56.93   68.81   95.05   95.05   95.05
Woods                    1284     28.27   47.51   47.90   47.66   52.65   58.18   75.86   75.86   75.86
Bldg-Grass-Trees-Drives  370      40      62.16   50.81   18.92   26.22   20.54   62.16   71.35   71.35
Stone-Steel-Towers       85       91.76   91.76   97.65   90.59   91.76   91.76   91.76   97.65   97.65
OA (%)                            69.59   68.84   70.59   68.43   66.61   71.44   72.11   72.98   74.02
AA (%)                            61.63   61.63   64.32   57.43   58.07   64.66   69.82   70.67   71.15
kappa                             0.6026  0.5872  0.6089  0.6152  0.6181  0.67    0.721   0.721   0.721
Table 2 (continued)

Classification with oversampling
                                  Spatial classification                           Multi-feature classification
                                  Neighborhood            Textural features        Without        Lag_Lag_Lag_
Class                    Test     Lin     Spl     Lag     Lin     Spl     Lag     oversampling   Lag_Lin
                         samples
Alfalfa                  44       88.64   86.36   86.36   84.09   88.64   84.09   90.91          93.18
Corn-notill              1424     19.87   21.14   22.68   9.27    13.41   10.60   30.34          38.97
Corn-mintill             824      28.28   31.92   27.55   16.38   17.23   24.51   34.59          57.28
Corn                     224      45.54   41.52   49.55   41.07   41.07   37.05   82.59          87.05
Grass/pasture            487      47.23   45.38   60.57   44.35   40.66   35.93   74.33          82.96
Grass/trees              737      82.77   81.82   85.89   31.34   31.48   27.68   89.69          93.49
Grass/pasture-mowed      16       93.75   93.75   93.75   81.25   87.50   87.50   87.50          87.50
Hay-windrowed            479      94.36   93.53   91.65   59.71   60.54   55.74   92.28          95.82
Oats                     10       100     100     100     80      80      90      100            100
Soybean-no till          958      49.16   49.69   49.58   27.56   34.66   43.01   55.22          68.06
Soybean-min till         2458     48.58   48.17   48.70   45.77   45.36   42.35   52.73          59.72
Soybean-clean till       604      70.03   69.04   66.23   40.89   30.79   39.57   50.83          62.42
Wheat                    202      95.05   95.54   95.05   85.15   90.10   86.63   98.51          99.01
Woods                    1284     63.16   72.98   75.86   69.55   63.24   64.49   62.31          62.69
Bldg-Grass-Trees-Drives  370      70.27   71.62   71.35   73.78   63.24   54.59   45.68          65.95
Stone-Steel-Towers       85       100     100     97.65   98.82   98.82   100     98.82          98.82
OA (%)                            70.83   71.5    73.12   69.02   65.16   67.62   78.79          83.87
AA (%)                            68.54   68.91   70.15   55.56   55.42   55.23   71.64          78.31
kappa                             0.574   0.5944  0.6187  0.475   0.4657  0.4662  0.78           0.8385



Figure 8: Classification maps obtained for the AVIRIS Indian
Pines scene: ground truth, classification without
oversampling (78.79%) and classification with oversampling
(82.87%).
4. Conclusion

In this paper, we have presented a new spectral-spatial
classification approach that combines various spectral and
spatial features via multi-feature kernels and generates
new training samples in order to address two problems
widely reported in hyperspectral image classification: the
difficulty of choosing the characterization methods and the
limited number of available labeled samples. It uses
oversampling based on interpolation techniques to increase
the size of the training set. By using the kernel
${}^{\alpha}K^{PCA+ICA+Neigh+Text+EMAP}$, which emphasizes
the spatial features computed by EMAP, and by adopting
Lagrange interpolation for the features extracted by PCA,
ICA, the average of neighborhood pixels and EMAP and linear
interpolation for the textural features in the oversampling
step, the proposed method provides higher accuracies than
the spectral classification, the spatial classification and
the classification without oversampling. Combining
multi-feature kernels and oversampling provides competitive
and encouraging results. Further work should focus on
exploiting active learning algorithms to improve the
quality of the samples generated in the oversampling step.